Search optimization technique for Domain Specific Parallel Crawler
Authors
Abstract
The architectural framework of the World Wide Web is used for accessing linked documents spread over millions of machines across the Internet. The Web is a system that makes the exchange of data on the Internet easy and efficient. Due to the exponential growth of the Web, it has become a challenge to traverse all URLs in web documents and to handle these documents, so it is necessary to optimize the parallel crawling process. In a domain-specific parallel crawler, different domains are distributed among crawlers to obtain fast results. The crawler crawls the Web periodically to maintain the freshness of the repository, but due to the large amount of data, relevant information is not updated frequently. This paper proposes a novel technique that uses a Selection Factor algorithm to optimize the search in a Domain Specific Parallel Crawler and keep relevant information in the repository fresh.
Keywords— Search Engine, Parallel Crawler, Domain Specific Parallel Crawler, Selection Factor Algorithm, Page Rank
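The abstract does not detail the Selection Factor algorithm itself, so the following is only a hypothetical sketch of the general idea: scoring each URL by combining an assumed domain-relevance value with an assumed page-change frequency, and using that score to prioritize which pages the crawler revisits first. The scoring function, weights, and field names are all illustrative assumptions, not the paper's method.

```python
import heapq

# Hypothetical sketch only: the paper's actual Selection Factor algorithm
# is not reproduced here. We assume each URL carries a domain-relevance
# score and an observed change frequency, both normalized to [0, 1].

def selection_factor(relevance, change_freq, w_rel=0.6, w_chg=0.4):
    """Combine relevance and change frequency into one priority score."""
    return w_rel * relevance + w_chg * change_freq

def build_revisit_queue(pages):
    """pages: list of (url, relevance, change_freq) tuples.
    Returns a heap ordered so the highest-scoring URL pops first."""
    heap = []
    for url, rel, chg in pages:
        # heapq is a min-heap, so negate the score for max-first ordering
        heapq.heappush(heap, (-selection_factor(rel, chg), url))
    return heap

pages = [
    ("http://example.com/news", 0.9, 0.8),   # relevant and fast-changing
    ("http://example.com/about", 0.7, 0.1),  # relevant but mostly static
    ("http://example.com/ads", 0.2, 0.9),    # fast-changing but off-topic
]
queue = build_revisit_queue(pages)
score, url = heapq.heappop(queue)  # highest selection factor first
```

Under these assumed weights, a page that is both on-topic and frequently changing is refreshed before a static or off-topic one, which matches the abstract's goal of keeping relevant information in the repository fresh.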
Related works
Prioritize the ordering of URL queue in Focused crawler
The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. This unfocused approach often yields undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...
An extended model for effective migrating parallel web crawling with domain specific crawling
The internet is large and has grown enormously; search engines are the tools for Web site navigation and search. Search engines maintain indices for web documents and provide search facilities by continuously downloading Web pages for processing. This process of downloading web pages is known as web crawling. In this paper we propose the architecture for Effective Migrating Parall...
An Extended Model for Effective Migrating Parallel Web Crawling with Domain Specific and Incremental Crawling
The internet is large and has grown enormously; search engines are the tools for Web site navigation and search. Search engines maintain indices for web documents and provide search facilities by continuously downloading Web pages for processing. This process of downloading web pages is known as web crawling. In this paper we propose the architecture for Effective Migrating Parall...
An Improved Technique for Web Page Classification in Respect of Domain Specific Search
A domain-specific crawler, as distinct from a general web search engine, focuses on a specific segment of web content. Such crawlers are also called vertical or topical search engines. Common vertical search engines are meant for shopping, the automotive industry, legal information, medical information, scholarly literature, and travel. Examples of vertical search engines are Trulia.com, Mocavo.com and Yel...
A Novel Approach to Integrated Search Information Retrieval Technique for Hidden Web for Domain Specific Crawling
Traditional web crawlers retrieve contents from only the "Surface Web" and are unable to crawl through the hidden portion of the Web, which contains high-quality information that is dynamically generated by querying databases when queries are submitted through a search interface. For the Hidden Web, most of the published research has been done to identify/detect such searchable forms and m...
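The migrating parallel crawling works listed above share one core idea with the main paper: distributing domains among parallel crawlers so they do not overlap. A minimal illustrative sketch of one common way to do this (hash-based domain partitioning; the specific papers' architectures are not reproduced here, and all names below are assumptions) is:

```python
import hashlib
from urllib.parse import urlparse

# Illustrative sketch, not any paper's exact architecture: assign each
# domain to one of N parallel crawlers by hashing its hostname, so every
# URL from a given domain goes to the same crawler and no two crawlers
# download the same domain.

def assign_crawler(url, num_crawlers):
    """Map a URL to a crawler index in [0, num_crawlers)."""
    host = urlparse(url).netloc
    digest = hashlib.md5(host.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_crawlers

urls = ["http://a.com/page1", "http://a.com/page2", "http://b.org/page3"]
assignments = {u: assign_crawler(u, 4) for u in urls}
# All URLs from a.com land on the same crawler index
```

A stable hash keeps the assignment consistent across crawl cycles, so a crawler can maintain per-domain state (robots.txt, politeness delays, freshness data) without coordination with its peers.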